Install & Load Packages

## Installing package into 'C:/Users/Lenovo/AppData/Local/R/win-library/4.2'
## (as 'lib' is unspecified)
## package 'tidyverse' successfully unpacked and MD5 sums checked
## 
## The downloaded binary packages are in
##  C:\Users\Lenovo\AppData\Local\Temp\RtmpExceyg\downloaded_packages
## Installing package into 'C:/Users/Lenovo/AppData/Local/R/win-library/4.2'
## (as 'lib' is unspecified)
## package 'ggplot2' successfully unpacked and MD5 sums checked
## 
## The downloaded binary packages are in
##  C:\Users\Lenovo\AppData\Local\Temp\RtmpExceyg\downloaded_packages
## Installing package into 'C:/Users/Lenovo/AppData/Local/R/win-library/4.2'
## (as 'lib' is unspecified)
## package 'plotly' successfully unpacked and MD5 sums checked
## 
## The downloaded binary packages are in
##  C:\Users\Lenovo\AppData\Local\Temp\RtmpExceyg\downloaded_packages
## Installing package into 'C:/Users/Lenovo/AppData/Local/R/win-library/4.2'
## (as 'lib' is unspecified)
## package 'dplyr' successfully unpacked and MD5 sums checked
## Warning: cannot remove prior installation of package 'dplyr'
## Warning in file.copy(savedcopy, lib, recursive = TRUE): problem copying C:
## \Users\Lenovo\AppData\Local\R\win-library\4.2\00LOCK\dplyr\libs\x64\dplyr.dll
## to C:\Users\Lenovo\AppData\Local\R\win-library\4.2\dplyr\libs\x64\dplyr.dll:
## Permission denied
## Warning: restored 'dplyr'
## 
## The downloaded binary packages are in
##  C:\Users\Lenovo\AppData\Local\Temp\RtmpExceyg\downloaded_packages
## Installing package into 'C:/Users/Lenovo/AppData/Local/R/win-library/4.2'
## (as 'lib' is unspecified)
## package 'skimr' successfully unpacked and MD5 sums checked
## 
## The downloaded binary packages are in
##  C:\Users\Lenovo\AppData\Local\Temp\RtmpExceyg\downloaded_packages
## ── Attaching packages
## ───────────────────────────────────────
## tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.2      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## 
## Attaching package: 'plotly'
## 
## 
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## 
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## 
## The following object is masked from 'package:graphics':
## 
##     layout
## 
## 
## 
## Attaching package: 'lubridate'
## 
## 
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
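
The `cannot remove prior installation of package 'dplyr'` warning above is commonly seen on Windows when a package is reinstalled while it is already loaded in the session. A minimal sketch of one way to avoid reinstalling on every run, assuming the package list matches the install and attach messages above; `missing_packages` is a hypothetical helper, not part of the original script:

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
# Note: requireNamespace() loads a namespace as a side effect when it is found.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Package names taken from the install and attach messages above.
pkgs <- c("tidyverse", "ggplot2", "plotly", "dplyr", "skimr", "lubridate")

to_install <- missing_packages(pkgs)
# Uncomment to install only what is missing, then attach as usual:
# if (length(to_install) > 0) install.packages(to_install)
# library(tidyverse)
```

Installing only what is missing keeps an already loaded package such as dplyr from being overwritten in the middle of a session.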

Importing Data sets

## Rows: 940 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): ActivityDate
## dbl (14): Id, TotalSteps, TotalDistance, TrackerDistance, LoggedActivitiesDi...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Examining Data

## [1] 940  15
## [1] 413   5
## [1] 67  8

Checking for Null & Duplicate values

## [1] 0
## [1] 0
## [1] 65
##             Id           Date       WeightKg   WeightPounds            Fat 
##              0              0              0              0             65 
##            BMI IsManualReport          LogId 
##              0              0              0
##           Id                  Date WeightKg WeightPounds   BMI IsManualReport
## 1 1503960366  5/2/2016 11:59:59 PM     52.6     115.9631 22.65           TRUE
## 2 1503960366  5/3/2016 11:59:59 PM     52.6     115.9631 22.65           TRUE
## 3 1927972279  4/13/2016 1:08:52 AM    133.5     294.3171 47.54          FALSE
## 4 2873212765 4/21/2016 11:59:59 PM     56.7     125.0021 21.45           TRUE
## 5 2873212765 5/12/2016 11:59:59 PM     57.3     126.3249 21.69           TRUE
## 6 4319703577 4/17/2016 11:59:59 PM     72.4     159.6147 27.45           TRUE
##          LogId
## 1 1.462234e+12
## 2 1.462320e+12
## 3 1.460510e+12
## 4 1.461283e+12
## 5 1.463098e+12
## 6 1.460938e+12
## [1] 0
## [1] 3
## [1] 0
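
The counts above look like the results of missing-value and duplicate checks on each table. A minimal base R sketch of such checks on a toy weight log; the values are made up, and dropping the sparse `Fat` column is an assumption based on its absence from the preview above:

```r
# Toy weight log; values are illustrative, with Fat mostly missing.
weight <- data.frame(
  Id       = c(1, 1, 2, 2),
  WeightKg = c(52.6, 53.1, 133.5, 133.2),
  Fat      = c(22, NA, NA, NA)
)
weight <- rbind(weight, weight[4, ])  # append an exact duplicate of row 4

total_na  <- sum(is.na(weight))       # missing cells in the whole table
na_by_col <- colSums(is.na(weight))   # missing cells per column
n_dup     <- sum(duplicated(weight))  # rows identical to an earlier row

# Drop the duplicated rows and the sparse column.
clean <- weight[!duplicated(weight), names(weight) != "Fat"]
```

`duplicated()` flags only the second and later copies of a row, so filtering on `!duplicated()` keeps one copy of each.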

Extracting days from date

## tibble [940 × 16] (S3: tbl_df/tbl/data.frame)
##  $ Id                      : num [1:940] 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
##  $ ActivityDate            : chr [1:940] "4/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
##  $ TotalSteps              : num [1:940] 13162 10735 10460 9762 12669 ...
##  $ TotalDistance           : num [1:940] 8.5 6.97 6.74 6.28 8.16 ...
##  $ TrackerDistance         : num [1:940] 8.5 6.97 6.74 6.28 8.16 ...
##  $ LoggedActivitiesDistance: num [1:940] 0 0 0 0 0 0 0 0 0 0 ...
##  $ VeryActiveDistance      : num [1:940] 1.88 1.57 2.44 2.14 2.71 ...
##  $ ModeratelyActiveDistance: num [1:940] 0.55 0.69 0.4 1.26 0.41 ...
##  $ LightActiveDistance     : num [1:940] 6.06 4.71 3.91 2.83 5.04 ...
##  $ SedentaryActiveDistance : num [1:940] 0 0 0 0 0 0 0 0 0 0 ...
##  $ VeryActiveMinutes       : num [1:940] 25 21 30 29 36 38 42 50 28 19 ...
##  $ FairlyActiveMinutes     : num [1:940] 13 19 11 34 10 20 16 31 12 8 ...
##  $ LightlyActiveMinutes    : num [1:940] 328 217 181 209 221 164 233 264 205 211 ...
##  $ SedentaryMinutes        : num [1:940] 728 776 1218 726 773 ...
##  $ Calories                : num [1:940] 1985 1797 1776 1745 1863 ...
##  $ Weekday                 : chr [1:940] "Tuesday" "Wednesday" "Thursday" "Friday" ...
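
A minimal sketch of how a `Weekday` column like the one above can be derived from the `ActivityDate` strings. It uses base R and a locale-independent lookup; the original script may well have used `weekdays()` or lubridate instead, and `day_names` is a name introduced here for illustration:

```r
# First four ActivityDate values from the str() output above.
activity_date <- c("4/12/2016", "4/13/2016", "4/14/2016", "4/15/2016")

# Parse month/day/year strings into Date objects.
parsed <- as.Date(activity_date, format = "%m/%d/%Y")

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday, independent of locale.
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
weekday <- day_names[as.POSIXlt(parsed)$wday + 1]

# Monday-to-Sunday factor, so tables and plots keep calendar order.
weekday <- factor(weekday, levels = c(day_names[2:7], day_names[1]))
```

Storing the result as a factor with explicit levels keeps later summaries and bar charts in calendar order rather than alphabetical order.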

Unique users

## [1] 33
## [1] 24
## [1] 8

Manual Recordings

## # A tibble: 5 × 2
##           Id `Manual Weight Report`
##        <dbl>                  <int>
## 1 1503960366                      2
## 2 2873212765                      2
## 3 4319703577                      2
## 4 4558609924                      5
## 5 6962181067                     30
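
A sketch of how a per-user count of manual weight entries like the table above could be produced in base R. The Ids and flags below are made up, and the column name `ManualWeightReport` is a stand-in for the label used in the output:

```r
# Toy weight log; Ids and flags are illustrative, not the real data.
weight <- data.frame(
  Id             = c(101, 101, 102, 103, 103, 103),
  IsManualReport = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)
)

# Keep manual entries only, then count them per user.
manual <- weight[weight$IsManualReport, ]
counts <- aggregate(IsManualReport ~ Id, data = manual, FUN = sum)
names(counts)[2] <- "ManualWeightReport"
```

Filtering before counting means users with no manual entries (Id 102 here) do not appear in the result.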

Merging Data Sets

##            Id ActivityDate TotalSteps TotalDistance TrackerDistance
## 13 1503960366     5/9/2016      12022          7.72            7.72
## 14 1503960366     5/9/2016      12022          7.72            7.72
## 15 1503960366     5/9/2016      12022          7.72            7.72
## 16 1503960366     5/9/2016      12022          7.72            7.72
## 17 1503960366     5/9/2016      12022          7.72            7.72
## 18 1503960366     5/9/2016      12022          7.72            7.72
##    LoggedActivitiesDistance VeryActiveDistance ModeratelyActiveDistance
## 13                        0               3.45                     0.53
## 14                        0               3.45                     0.53
## 15                        0               3.45                     0.53
## 16                        0               3.45                     0.53
## 17                        0               3.45                     0.53
## 18                        0               3.45                     0.53
##    LightActiveDistance SedentaryActiveDistance VeryActiveMinutes
## 13                3.74                       0                46
## 14                3.74                       0                46
## 15                3.74                       0                46
## 16                3.74                       0                46
## 17                3.74                       0                46
## 18                3.74                       0                46
##    FairlyActiveMinutes LightlyActiveMinutes SedentaryMinutes Calories Weekday
## 13                  11                  206              835     1819  Monday
## 14                  11                  206              835     1819  Monday
## 15                  11                  206              835     1819  Monday
## 16                  11                  206              835     1819  Monday
## 17                  11                  206              835     1819  Monday
## 18                  11                  206              835     1819  Monday
##                 SleepDay TotalSleepRecords TotalMinutesAsleep TotalTimeInBed
## 13 4/23/2016 12:00:00 AM                 1                361            384
## 14 4/23/2016 12:00:00 AM                 1                361            384
## 15 4/24/2016 12:00:00 AM                 1                430            449
## 16 4/24/2016 12:00:00 AM                 1                430            449
## 17 4/25/2016 12:00:00 AM                 1                277            323
## 18 4/25/2016 12:00:00 AM                 1                277            323
##                    Date WeightKg WeightPounds   BMI IsManualReport        LogId
## 13 5/3/2016 11:59:59 PM     52.6     115.9631 22.65           TRUE 1.462320e+12
## 14 5/2/2016 11:59:59 PM     52.6     115.9631 22.65           TRUE 1.462234e+12
## 15 5/3/2016 11:59:59 PM     52.6     115.9631 22.65           TRUE 1.462320e+12
## 16 5/2/2016 11:59:59 PM     52.6     115.9631 22.65           TRUE 1.462234e+12
## 17 5/3/2016 11:59:59 PM     52.6     115.9631 22.65           TRUE 1.462320e+12
## 18 5/2/2016 11:59:59 PM     52.6     115.9631 22.65           TRUE 1.462234e+12
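
In the merged rows above, a single activity day (5/9/2016) is paired with sleep records from 4/23 to 4/25 and weight records from 5/2 and 5/3, which suggests the join was keyed on `Id` alone. If so, every activity row is repeated for every sleep and weight row of the same user, which inflates row counts and skews any summary built on the merged table. A minimal sketch of the difference, using toy tables with made-up sleep values and a `Date` key column introduced here for illustration:

```r
# Toy tables; sleep values are illustrative.
activity <- data.frame(
  Id           = c(1, 1),
  ActivityDate = c("4/12/2016", "4/13/2016"),
  TotalSteps   = c(13162, 10735)
)
sleep <- data.frame(
  Id                 = c(1, 1),
  SleepDay           = c("4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM"),
  TotalMinutesAsleep = c(400, 420)
)

# Keyed on Id only: every activity row pairs with every sleep row of that user.
by_id <- merge(activity, sleep, by = "Id")

# Derive a shared Date key; as.Date() ignores the trailing time text.
activity$Date <- as.Date(activity$ActivityDate, format = "%m/%d/%Y")
sleep$Date    <- as.Date(sleep$SleepDay, format = "%m/%d/%Y")

# Keyed on Id and Date: one row per user per day.
by_day <- merge(activity, sleep, by = c("Id", "Date"))
```

Here `by_id` has four rows and `by_day` has two. Comparing `nrow()` before and after a merge is a quick way to spot this kind of fan-out.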

Data Recorded during the week

The bar graph shows that the most records were logged from Tuesday to Thursday. Monday and Friday are also weekdays, so why do they have fewer records than the other weekdays?
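
One possible answer to the question above is the length of the tracking window. A sketch, assuming the window runs from 4/12/2016 (the first activity date shown earlier) to 5/12/2016 (the latest date visible in the weight log); both endpoints are read off the outputs and are not confirmed by the source:

```r
# Assumed tracking window (see the note above): 4/12/2016 to 5/12/2016.
days <- seq(as.Date("2016-04-12"), as.Date("2016-05-12"), by = "day")

# Locale-independent weekday names (wday: 0 = Sunday ... 6 = Saturday).
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
week_order <- c(day_names[2:7], day_names[1])
weekday <- factor(day_names[as.POSIXlt(days)$wday + 1], levels = week_order)

# Number of times each weekday occurs inside the window.
days_per_weekday <- table(weekday)
```

Under that assumption the 31-day window contains five Tuesdays, Wednesdays and Thursdays but only four of every other day. That 5:4 ratio is consistent with the Weekday counts in the summary further down (5440 against 4352), so the difference may reflect the calendar rather than user behaviour.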

Total Minutes Asleep during the week

Summary

##       Weekday       TotalSteps    TotalDistance   VeryActiveMinutes
##  Monday   :4352   Min.   :    0   Min.   : 0.00   Min.   :  0.00   
##  Tuesday  :5440   1st Qu.: 5908   1st Qu.: 3.91   1st Qu.:  0.00   
##  Wednesday:5440   Median :10320   Median : 6.82   Median : 18.00   
##  Thursday :5414   Mean   : 9657   Mean   : 6.49   Mean   : 23.73   
##  Friday   :4352   3rd Qu.:12207   3rd Qu.: 8.35   3rd Qu.: 38.00   
##  Saturday :4352   Max.   :20031   Max.   :13.24   Max.   :210.00   
##  Sunday   :4352                                                    
##  FairlyActiveMinutes LightlyActiveMinutes SedentaryMinutes    Calories   
##  Min.   : 0.00       Min.   :  0.0        Min.   :   0.0   Min.   :   0  
##  1st Qu.: 4.00       1st Qu.:199.0        1st Qu.: 634.0   1st Qu.:1850  
##  Median :15.00       Median :243.0        Median : 683.0   Median :2039  
##  Mean   :18.32       Mean   :241.5        Mean   : 689.4   Mean   :2011  
##  3rd Qu.:33.00       3rd Qu.:295.0        3rd Qu.: 731.0   3rd Qu.:2173  
##  Max.   :74.00       Max.   :432.0        Max.   :1440.0   Max.   :4552  
##                                                                          
##  TotalMinutesAsleep TotalTimeInBed   WeightPounds  
##  Min.   : 59.0      Min.   : 65.0   Min.   :116.0  
##  1st Qu.:411.0      1st Qu.:424.0   1st Qu.:134.9  
##  Median :442.0      Median :457.0   Median :135.6  
##  Mean   :437.5      Mean   :456.3   Mean   :138.6  
##  3rd Qu.:476.0      3rd Qu.:497.0   3rd Qu.:136.5  
##  Max.   :750.0      Max.   :775.0   Max.   :294.3  
## 

Active Minutes Analysis

Share of total minutes in each of the four categories: very active, fairly active, lightly active, and sedentary. The pie chart shows that users spent 81.3% of their tracked minutes sedentary and only 1.74% very active.
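
A minimal sketch of how such shares can be computed: sum each minute column, then divide by the grand total. It uses only the first five rows visible in the earlier `str()` output, so the resulting percentages illustrate the calculation and will not match the figures quoted above:

```r
# First five rows of the minute columns, copied from the str() output earlier.
minutes <- data.frame(
  VeryActiveMinutes    = c(25, 21, 30, 29, 36),
  FairlyActiveMinutes  = c(13, 19, 11, 34, 10),
  LightlyActiveMinutes = c(328, 217, 181, 209, 221),
  SedentaryMinutes     = c(728, 776, 1218, 726, 773)
)

# Total minutes per category, then each category's share of all tracked minutes.
totals <- colSums(minutes)
share_pct <- round(100 * totals / sum(totals), 2)

# pie(totals, labels = paste0(names(totals), " ", share_pct, "%")) would draw the chart.
```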

According to the American Heart Association, the daily goal is 21.4 fairly active minutes or 10.7 very active minutes.

## # A tibble: 6 × 2
## # Groups:   Id [6]
##           Id     n
##        <dbl> <int>
## 1 1503960366    30
## 2 1624580081     6
## 3 1644430081    16
## 4 1927972279     2
## 5 2022484408    28
## 6 2320127002     2
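
The daily figures quoted above match a weekly target of 150 minutes of moderate activity or 75 minutes of vigorous activity, divided by seven days. A sketch of how a per-user count like the table above could be produced, assuming `n` counts the days on which a user meets either daily goal (the source does not say what `n` measures); the records and the `MeetsGoal` column are made up for illustration:

```r
# Weekly targets divided by 7 give the daily goals quoted above.
fairly_goal <- 150 / 7   # about 21.4 minutes per day
very_goal   <- 75 / 7    # about 10.7 minutes per day

# Toy daily records; Ids and minutes are illustrative.
daily <- data.frame(
  Id                  = c(1, 1, 1, 2, 2),
  VeryActiveMinutes   = c(25, 0, 5, 0, 12),
  FairlyActiveMinutes = c(13, 30, 10, 0, 0)
)

# A day counts if it reaches either threshold.
daily$MeetsGoal <- daily$FairlyActiveMinutes >= fairly_goal |
  daily$VeryActiveMinutes >= very_goal

# Days meeting the goal, per user.
goal_days <- aggregate(MeetsGoal ~ Id, data = daily, FUN = sum)
```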

Noticeable day

The bar graph shows a jump on Saturday: users spend less time sedentary and take more steps. Users are out and about on Saturdays.
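
A sketch of the per-weekday averages behind a comparison like this, in base R. The records are made up, so the numbers only show the shape of the calculation:

```r
# Toy daily records; values are illustrative.
daily <- data.frame(
  Weekday          = c("Friday", "Friday", "Saturday", "Saturday"),
  TotalSteps       = c(7000, 8000, 9000, 11000),
  SedentaryMinutes = c(1000, 1020, 940, 960)
)

# Mean steps and mean sedentary minutes for each weekday.
by_day <- aggregate(cbind(TotalSteps, SedentaryMinutes) ~ Weekday,
                    data = daily, FUN = mean)
```

Averaging per weekday, rather than summing, keeps the comparison fair when some weekdays have more records than others.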

Hourly Steps

Users take the most steps between 5 PM and 7 PM.

##           Id        ActivityHour StepTotal Hour
## 1 1503960366 2016-04-12 00:00:00       373   00
## 2 1503960366 2016-04-12 01:00:00       160   01
## 3 1503960366 2016-04-12 02:00:00       151   02
## 4 1503960366 2016-04-12 03:00:00         0   03
## 5 1503960366 2016-04-12 04:00:00         0   04
## 6 1503960366 2016-04-12 05:00:00         0   05
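
A minimal sketch of how the `Hour` column above can be derived and used to find the busiest hour. The rows are copied from the table; treating the timestamps as UTC is an assumption made to sidestep daylight-saving effects, and `mean_by_hour` and `busiest_hour` are names introduced here:

```r
# Rows copied from the table above.
hourly <- data.frame(
  ActivityHour = c("2016-04-12 00:00:00", "2016-04-12 01:00:00",
                   "2016-04-12 02:00:00", "2016-04-12 03:00:00"),
  StepTotal    = c(373, 160, 151, 0)
)

# Parse the timestamp (UTC assumed) and keep the two-digit hour.
stamp <- as.POSIXct(hourly$ActivityHour, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
hourly$Hour <- format(stamp, "%H")

# Mean steps for each hour of the day, across all rows.
mean_by_hour <- tapply(hourly$StepTotal, hourly$Hour, mean)
busiest_hour <- names(which.max(mean_by_hour))
```

With only four rows from one user this says nothing about the 5 PM to 7 PM peak; on the full table the same `tapply()` call would average over every user and date.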

Weekly steps

Users take the most steps on Tuesdays and Saturdays.

Total Steps Vs. Calories

Some sedentary users who take minimal steps still burn 1500 to 2500 calories, a range similar to that of more active users who take more steps.

## `geom_smooth()` using formula 'y ~ x'
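
A fitted line on a scatter plot can be backed by a number. A sketch using base R's `cor()` and `lm()` on the first five rows visible in the earlier `str()` output; five rows from a single user only illustrate the calls and are not a result for the dataset:

```r
# First five rows of TotalSteps and Calories from the str() output earlier.
steps    <- c(13162, 10735, 10460, 9762, 12669)
calories <- c(1985, 1797, 1776, 1745, 1863)

# Pearson correlation: +1 is a perfect increasing line, 0 is no linear relationship.
r <- cor(steps, calories)

# Least-squares line; the slope is the change in calories per additional step.
fit <- lm(calories ~ steps)
slope <- unname(coef(fit)["steps"])
```

Reporting `r` alongside the plot makes statements such as "similar calories" or "not much correlation" easier to check.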

Steps Vs. Active Minutes

Comparing the four activity levels with total steps, most of the data is concentrated on users who take about 5,000 to 15,000 steps a day. These users spend on average 8 to 13 hours sedentary, about 5 hours lightly active, and 1 to 2 hours fairly or very active.

Sleep Vs. Calories Burnt

Do people who sleep more burn fewer calories? Plotting the two variables shows little correlation.

## `geom_smooth()` using formula 'y ~ x'